Host-associated microbiota varies in distribution depending on the body area inhabited. Gut microbes are 
known to interact with the human immune system, maintaining gut homoeostasis. Thus, we studied 
whether secreted IgA (S-IgA) coats specific microbial taxa without inducing strong immune responses. To 
do so, we fractionated gut microbiota by flow cytometry. We found that active and S-IgA-coated bacterial 
fractions were characterized by a higher diversity than those observed in raw faecal suspensions. A long-tail 
effect was observed in family distribution, revealing that rare bacteria represent up to 20% of total diversity. 
While Firmicutes was the most abundant phylum, the majority of its sequences were not assigned at the 
genus level. Finally, the single-cell-based approach enabled us to focus on active and S-IgA-coated bacteria. 
Thus, we revealed a microbiota core common to the healthy volunteers participating in the study. 
Interestingly, this core was composed mainly of low frequency taxa (e.g. Sphingomonadaceae). 

The human gastrointestinal tract (GIT) hosts a complex bacterial community which, for decades, has been 
subject to extensive study involving both culture-dependent and -independent methods 1-4 . The GIT microbiota 
is acquired very early, within about a month of the delivery of the baby 5 . Together with this 
very short time scale, the "inoculum" is determined by the mode of delivery (vaginal or C-section), by the mode of 
feeding (breast milk or formula) and by environmental factors at the time of delivery, among others 6 . 

Most likely, very few organisms of maternal origin are maintained in the newborn after the first month of 
life 9,10 ; after this short time, the intestine already hosts its own microbiota, which undergoes complex bacterial 
turnover until reaching a stable composition that then remains fairly constant for life 11 . To achieve this homoeostasis, 
innate and adaptive immune responses work together. Growing evidence shows a parallel evolution of the 
immune system with members of the gastrointestinal microbiota 12-14 . Thus, from the first day of life, Toll-like 
receptors (TLR1 to TLR9) are adapted and can recognize specific microbial structures such as lipopolysaccharide 
(LPS) 15 . Lotz and collaborators described how newborn mice acquire tolerance to LPS-expressing 
bacteria within one hour of delivery; this process contributes to establishing microbiota-host 
homoeostasis early after delivery 16 . In a very well-designed study, Giraud and collaborators demonstrated that 
Escherichia coli MG1655 inoculated into 8- to 12-week-old gnotobiotic mice evolves rapidly through a fitness 
process based on mutation and adaptation to the new environment, altering the EnvZ/OmpR operon in 90% of 
bacteria harvested from the faeces of independent mice after a wild-type "inoculum" 17 . On the other hand, the immune 
system has to adapt to microbial invasion, avoiding complete annihilation of the intruders, which have to be 
recognized as the second partner in the mutualistic relationship. These interactions are maintained throughout life. 
In this scenario, the immune system is continuously stimulated, although controlled, avoiding abnormal responses 
to commensals or potential pathogens that pass through the GIT. For instance, some bacteria such as 
Sphingomonas members have been identified as players in the maintenance of the immune response, producing 
antigenic glycosphingolipids that have been correlated with the expansion of invariant natural killer T cells 18 . 

Already in 1987, Conley & Delacroix estimated that approximately 40 mg of IgA per kg of body 
weight is produced in the human intestine every day 19 . It was estimated that in healthy conditions about 36% of the GIT microbiota is 
covered by IgA and that this proportion can increase up to 69% during inflammatory processes 20 . 

A hot research topic involves understanding the role of dominant bacteria and their relationships with rare 
ones. In a previous work, we highlighted that low-frequency taxa are present and consist mainly of active 
bacteria whose diversity is statistically hidden by the dominant majority 21 . The gastrointestinal tract is the 
richest body area in terms of microbial groups. The most represented taxa are also the most commonly shared 
between different samples. However, most of the yet-unknown taxa belong to the under-represented fraction of 
the gut microbiota 22,23 . 

SCIENTIFIC REPORTS | 3 : 3515 | DOI: 10.1038/srep03515 

While next-generation DNA sequencing methodologies have shed light 
on the enormous variety of bacteria inhabiting the human body, a 
direct cell-by-cell understanding of which bacteria are the active players and 
which of them interact with the immune system still represents a 
research challenge. Previously, Van Der Waaij and collaborators 
(1994) described a method for sorting human faecal microbiota based 
on fluorescent labelling with anti-human IgA 24 . We used an interdisciplinary 
single-cell approach involving RNA cell staining as a measure 
of activity 25,26 , fluorescent anti-human IgA labelling, flow 
cytometry and cell sorting, 16S rDNA amplification and deep pyrosequencing 
to shed light on different segments of the gut microbiota, 
such as the potential activity of under-represented bacteria. We 
observed that a long tail of low-frequency taxa (below 1% in richness) 
could involve up to 20% of the active population. We also analysed 
active and S-IgA-coated bacteria from apparently healthy volunteers who had 
not received antibiotic treatment. Our 
aim was to determine whether S-IgA coats specific taxa. We 
identified several groups, such as members of Actinobacteria or 
Sphingobacteria, for which there are already hypotheses regarding 
their role and activity, confirmed in murine models, as active players 
in the gut microbiota and/or as part of the S-IgA-coated fraction. 

Results 

Sequencing and diversity overview. From each sample and fraction 
we obtained an average of 6,980 sequences over 200 bp (min: 727; 
max: 14,359 reads). The average read length was 422 bp (± 89 bp). 
The number of reads per sample/fraction, as well as the 
number of particles (cells) sorted by flow cytometry and their ratio 
with respect to the total number, are reported in Table 1. As can be 
observed, the fraction of active cells ranged from 34% (sample V2) to 
66% (V4) with respect to the total community, and the proportion of 
S-IgA-coated bacteria ranged from 21% (V4) to 98% (V3). 

Diversity analysis at the genus level showed that FS fractions present 
a lower evenness than almost all Act fractions (V2, V3, 
V4, V5 and V6) and all IgA fractions, as revealed by the 
Shannon index. The richness estimator Chao1 detected different 
behaviours of FS with respect to Act and IgA, being lower in samples 
V1, V2, V3, V5 and V6 and higher in V4 (see Table 1). 
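
As an illustration of the two indices used above, the following minimal Python sketch computes the Shannon index and a Chao1 richness estimate from a vector of read counts per genus. The counts are hypothetical and are not data from this study, and the bias-corrected form of Chao1 is an assumption, since the text does not state which variant or software produced the values in Table 1.

```python
import math


def shannon_index(counts):
    """Shannon index H' = -sum(p_i * ln(p_i)) over taxa with non-zero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def chao1(counts):
    """Bias-corrected Chao1 (assumed variant): S_obs + F1*(F1-1) / (2*(F2+1)).

    F1 and F2 are the numbers of taxa seen exactly once and exactly twice.
    """
    observed = sum(1 for c in counts if c > 0)
    singletons = sum(1 for c in counts if c == 1)
    doubletons = sum(1 for c in counts if c == 2)
    return observed + singletons * (singletons - 1) / (2 * (doubletons + 1))


# Hypothetical genus-level read counts for one sample/fraction.
genus_counts = [50, 30, 10, 5, 2, 2, 1, 1, 1]

h = shannon_index(genus_counts)
richness = chao1(genus_counts)
print(f"Shannon: {h:.2f}  Chao1: {richness:.2f}")
```

A community dominated by a few genera gives a low Shannon value even when many rare genera are present, which is the pattern the FS fractions show relative to the sorted fractions.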

Rarefaction curves flatten at the family level, and all FS fractions 
flatten earlier than Act and IgA fractions (see Figure 1). Analysis of 
family distribution revealed that Act and IgA fractions tend to cluster 
together (Figure 2, panel A). Moreover, Act or S-IgA-coated bacterial 
fractions are not representative of the whole community; neither 
can they be considered a subset of it, because of the great richness 
and evenness of the species found. Sample V1 showed a different microbial 
distribution from the other samples, as also shown in 
the cluster analysis, where V1 fractions behave as an outgroup with 
respect to the other samples/fractions (Figure 2, panel A). At the level 
of genera, canonical correspondence analysis of all samples collected 
showed that the two major components account for 43.62% and 17.60% 
of the observed diversity (Figure 3). As expected, sample V1 is the 
most divergent; also noteworthy is the clear separation between 
the three fractions. These results are statistically corroborated by 
analysis of variance on distance matrices with "Adonis" (Permutational 
Multivariate Analysis of Variance Using Distance Matrices), 
also known as "Permutational MANOVA" 27,28 , applying Bray-Curtis 
distances calculated at the genus taxonomic rank. The analysis 
showed significant p-values for both the fraction and the sample grouping 
criteria (Figure 3 and Supplemental Information, Table 1). 
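
The Bray-Curtis distance underlying the clustering and the permutational analysis above can be sketched in a few lines of Python. This is only an illustration of the distance itself, not of the Adonis test, and the sample labels and counts in the contingency table are hypothetical.

```python
def bray_curtis(u, v):
    """Bray-Curtis dissimilarity: sum(|u_i - v_i|) / sum(u_i + v_i)."""
    if len(u) != len(v):
        raise ValueError("abundance vectors must have the same length")
    numerator = sum(abs(a - b) for a, b in zip(u, v))
    denominator = sum(a + b for a, b in zip(u, v))
    return numerator / denominator if denominator else 0.0


def distance_matrix(table):
    """Pairwise Bray-Curtis distances for a {label: counts} contingency table."""
    labels = sorted(table)
    return {(a, b): bray_curtis(table[a], table[b]) for a in labels for b in labels}


# Hypothetical genus contingency table (rows: sample.fraction, columns: genera).
genus_table = {
    "V1.FS": [10, 0, 5],
    "V1.Act": [5, 5, 5],
    "V1.IgA": [0, 10, 5],
}

distances = distance_matrix(genus_table)
for (a, b), d in sorted(distances.items()):
    if a < b:
        print(f"{a} vs {b}: {d:.3f}")
```

The value ranges from 0 (identical composition) to 1 (no shared taxa), so fractions with similar genus profiles end up close together in the clustering of Figure 2, panel A.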

To classify low- and high-frequency families, we applied a method 
based on the average inflection point, as previously described 21 . The 
frequency distribution per family over the whole dataset 
had mean and median values of 1.60% and 0.1%, respectively (see 
Figure 2, panel B), confirming the numeric importance of rare families. 
Overall mean values per fraction showed that dominant families 
represented 97.11% (s.d.: 1.1), 36.37% (s.d.: 20.39) and 
43.46% (s.d.: 33.54) of families in FS, Act and IgA fractions, respectively, 
while rare families (frequency below 1%) among all samples 
represented on average 2.89% (s.d.: 1.1), 5.81% (s.d.: 2.41) and 6.21% 
(s.d.: 2.08) of families in FS, Act and IgA fractions, respectively (see 
Figure 2, panel C). Thus, the low-frequency taxa were extraordinarily 
diverse and diversity is characterized, as expected, by the "long-tail" 
effect. 
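
The split into dominant and rare families can be illustrated with a short Python sketch. It uses a fixed 1% cut-off, as in Figure 2, panel C; it does not reproduce the inflection-point method of reference 21, and the family counts below are hypothetical.

```python
def split_dominant_rare(family_counts, threshold=0.01):
    """Split families into dominant (frequency >= threshold) and rare (< threshold).

    Returns two dicts mapping family name to relative frequency.
    """
    total = sum(family_counts.values())
    dominant, rare = {}, {}
    for family, count in family_counts.items():
        frequency = count / total
        if frequency >= threshold:
            dominant[family] = frequency
        else:
            rare[family] = frequency
    return dominant, rare


# Hypothetical read counts per family for one sample/fraction.
family_counts = {
    "Lachnospiraceae": 600,
    "Ruminococcaceae": 300,
    "Bacteroidaceae": 92,
    "Sphingomonadaceae": 5,
    "Microbacteriaceae": 3,
}

dominant, rare = split_dominant_rare(family_counts)
print("dominant:", sorted(dominant))
print("rare:", sorted(rare), f"({sum(rare.values()):.1%} of reads)")
```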

OTUs 97 distribution and core microbiota. Using an analogy with 
the pan-genome concept 29,30 we can define "pan-microbiota" by 
identifying sequence clusters shared among samples (a "core" set of 
bacteria) as well as clusters specific to each sample. 

Table 1 | Details of samples/fractions. For each sample and fraction, columns describe the number of sorted 
cells, the proportion of sorted cells with respect to the total number, the number of reads, the number of genera, 
the Shannon index of diversity, the Chao1 richness and its standard deviation, respectively 

Sample  Fraction  Sorted cells  Fraction rate  Reads   Genera  Shannon  Chao1    SE.Chao1
V1      FS        NA*           1              6636    77      3.02     89.75    9.09
V1      IgAPos    51010         0.1            4483    101     3.56     112.77   8.32
V1      PYPos     28052         0.06           14359   107     2.47     120      11.71
V2      FS        NA            1              10327   74      2.59     97.75    17.56
V2      IgAPos    217181        0.8            6653    96      3.02     113      12.75
V2      PYPos     142527        0.34           13222   104     2.8      112.08   6.34
V3      FS        NA            1              10869   70      2.24     83.91    10.04
V3      IgAPos    206999        0.98           3248    92      2.82     109.25   10.26
V3      PYPos     34087         0.45           2105    92      3.24     101.71   7.05
V4      FS        NA            1              9922    66      2.23     83.5     11.54
V4      IgAPos    19745         0.21           974     69      3.17     81.21    8.32
V4      PYPos     206352        0.66           699     41      2.5      44.5     4.84
V5      FS        NA            1              10205   41      1.88     50.33    16.49
V5      IgAPos    94701         0.32           1612    46      2.33     61.6     16.43
V5      PYPos     242980        0.61           4348    60      1.98     81       20.9
V6      FS        NA            1              11737   50      2.17     72.5     59.58
V6      IgAPos    21672         0.57           2657    78      3.02     91.6     10.26
V6      PYPos     130000        0.41           9084    87      2.82     92.14    5.88

*DNA from FS fractions was obtained directly from faecal suspension without flow cytometry steps, see Materials and Methods. 
IgAPos: S-IgA-coated (IgA) fraction. PYPos: pyronin-Y-positive (Act) fraction. 

Comparative analysis of species-level operational taxonomic units 
(sequences at 97% similarity, 80% overlapping, hereinafter OTUs 97 ) 
revealed that, among all samples, 28, 21 and 69 OTUs 97 were shared 
among IgA, Act and FS fractions, respectively (see Figure 4). 
The shared diversity of the FS core comprised only members 
of highly abundant families of Firmicutes and Bacteroidetes, while 
in Act and IgA fractions, members of Proteobacteria were also 
found, such as Alphaproteobacteria mainly belonging to Sphingomonadales, 
Gammaproteobacteria belonging to Pseudomonadales, 
and Betaproteobacteria belonging to Burkholderiales; in the 
IgA core we also found members of Actinobacteria, which were 
absent in the FS (see Table 2). 

A less restrictive analysis, defining the core as the 
clusters represented in at least 5 out of 6 samples, showed that the 
FS core was almost invariable, adding only members of Streptococcaceae. 
The Act fraction also included members of Xanthomonadaceae 
from Gammaproteobacteria, and Microbacteriaceae and Nocardiaceae 
from Actinobacteria (see Table 2). 
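
The two core definitions used above (clusters present in all samples, and clusters present in at least 5 out of 6 samples) amount to counting in how many samples each OTU occurs. A minimal Python sketch follows; the OTU identifiers and their distribution are hypothetical.

```python
from collections import Counter


def core_otus(otus_by_sample, min_samples=None):
    """Return OTUs present in at least `min_samples` samples.

    With min_samples=None the strict core (present in every sample) is returned.
    """
    required = len(otus_by_sample) if min_samples is None else min_samples
    presence = Counter(otu for otus in otus_by_sample.values() for otu in set(otus))
    return {otu for otu, n in presence.items() if n >= required}


# Hypothetical OTU sets for one fraction across the six volunteers.
otus_by_sample = {
    "V1": {"otu1", "otu2", "otu3"},
    "V2": {"otu1", "otu2"},
    "V3": {"otu1", "otu2", "otu4"},
    "V4": {"otu1", "otu2"},
    "V5": {"otu1", "otu2", "otu3"},
    "V6": {"otu1", "otu3"},
}

strict_core = core_otus(otus_by_sample)
relaxed_core = core_otus(otus_by_sample, min_samples=5)
print("strict core:", sorted(strict_core))
print("5-of-6 core:", sorted(relaxed_core))
```

The relaxed core is always a superset of the strict one, which is why the less restrictive analysis can only add families to Table 2 (those marked with asterisks).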

Taxonomy analysis. Unassigned genera. Genera were considered 
unassigned when annotated by the RDP classifier with a confidence 
score lower than 0.8. These were annotated using upper taxonomic 
rank levels. In all samples a very high rate of unassigned genera was 
found. In FS, unassigned genera ranged from 24.77% of 
reads in sample V6 to 44.56% in sample V4. In the Act fraction, the 
percentage of unidentified genera was more uniformly distributed, 
ranging from 8.27% (V3) to 18.83% (V4), except for sample V1 
(0.46%). In S-IgA-coated fractions, unidentified genera ranged from 
2.45% in V1 to 35.01% in V2. 

The number of OTUs 97 belonging to unassigned genera was 
9,122 out of 14,558. These data are shown grouped by main classes 
in Supplemental Information, Figure 1. 
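
The rule for unassigned genera described above (fall back to an upper rank when the genus confidence is below 0.8) can be sketched as follows. The sketch does not call the RDP classifier; it only post-processes a lineage with per-rank confidence values of the kind such a classifier reports. The function name, the input layout and the example lineage are hypothetical, and stopping at the first rank below the threshold is an assumption.

```python
def assign_with_threshold(lineage, threshold=0.8):
    """Return (rank, name) of the deepest rank supported at >= threshold.

    `lineage` is a list of (rank, name, confidence) tuples ordered from
    phylum down to genus. Returns None if even the first rank fails.
    """
    best = None
    for rank, name, confidence in lineage:
        if confidence < threshold:
            break  # assumed: ranks below an unsupported rank are not trusted
        best = (rank, name)
    return best


# Hypothetical classification of one read.
lineage = [
    ("phylum", "Firmicutes", 1.00),
    ("class", "Clostridia", 0.99),
    ("order", "Clostridiales", 0.98),
    ("family", "Lachnospiraceae", 0.85),
    ("genus", "Blautia", 0.55),
]

rank, name = assign_with_threshold(lineage)
print(f"assigned at {rank} level: {name}")
```

Under this rule the example read counts as an unassigned genus and is reported at the family level.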

Firmicutes. As expected, Clostridiales was the most populated order 
not only within Firmicutes but also across all other phyla. 
Overall, among taxa with frequencies equal to or greater than 1% we 
found: Lachnospiraceae (including Lachnospira, Roseburia, Incertae 
Sedis XIV and Blautia), Ruminococcaceae (Faecalibacterium) and other 
unidentified Clostridiales (see Supplemental Information, Figure 2 - 
Firmicutes). Within the Bacilli class, we found several low-frequency 
(less than 1%) families and genera belonging only to Act 
or IgA fractions. Within Lactobacillales, members of Enterococcus, 
some unidentified Enterococcaceae and Lactobacillus were found to 
be common to all samples, at least in Act and IgA fractions. Within 
Streptococcaceae, Streptococcus was the genus shared among all samples 
(see Supplemental Information, Figure 2 - Firmicutes). 

Figure 1 | Rarefaction curves calculated for each sample/fraction. FS: faecal suspension. Act: active fractions. IgA: S-IgA-coated bacterial fractions. Every 
plot corresponds to a volunteer. The X axis represents the number of sequences while the Y axis represents the number of families encountered. 

Figure 2 | Frequency distributions at family taxonomic rank level. Panel (A) shows clustering of samples/fractions at family level using the Bray-Curtis 
distance matrix and the complete linkage clustering method. Panel (B) shows the distribution of all frequency values calculated at family level; the Y axis is 
represented in logarithmic scale. Panel (C) shows aggregation of dominant and rare families, whose abundance values are equal to or higher than 1% or 
lower than 1%, respectively. Bars define samples while bar groups define fractions (see legend above the panel). 



Bacteroidetes. Bacteroidetes was the second most recruited phylum, 
in which the genus Bacteroides was found in all samples except V1. It 
was highly recruited from the FS fraction, accounting for up to 
38.86% of total reads in sample V6 (19.81%, 15.21%, 13.93% and 
7.80% in V5, V3, V4 and V2, respectively). Several groups of 
Bacteroidetes were mainly recruited from the FS fractions; for 
example, unidentified Porphyromonadaceae were found uniquely 
in FS fractions, in all samples except V5. Prevotella was also one of 
the universally recruited groups in Act (all samples), IgA (samples 
V1, V2, V3 and V5) and FS (samples V1, V2, V3 and V4) fractions. 
Parabacteroides was similar to Prevotella; it was found in FS (all 
but V6), Act (all samples) and IgA (all but V5) fractions (see 
Supplemental Information, Figure 2 - Bacteroidetes). 

Proteobacteria. Almost all Proteobacteria were found as part of the 
active and/or S-IgA-opsonised microbiota. 

Only Sphingomonas, a member of Alphaproteobacteria, was found 
at a frequency higher than 1%, at least in the IgA fraction of V1, V2, V3 
and V4, although it was also present at less than 1% in V6 (Act 
and IgA). Pseudomonas was also highly recruited in Act and IgA 
fractions of samples V3, V4 and V6. No Proteobacteria over 1% 
in frequency were found in sample V5. Several families of 
Sphingomonadales were found to be common to all samples, such as 
Porphyrobacter and unassigned Erythrobacteraceae (all but V4 and 
V5), Novosphingobium (all samples), Sphingomonas (all but V5) and 
other unidentified families (see Supplemental Information, Figure 2 - 
Proteobacteria). 

Figure 3 | Canonical correspondence analysis. The distribution has been obtained from the Bray-Curtis distance matrix, calculated with the genus 
contingency table. Axes represent distance values on coordinates 1 and 2. Red lines and labels separate samples; light blue lines separate fractions. Gray 
labels with "Ssample.Fraction" indications describe the distribution of each experiment. Pink labels represent the distribution of genera. 

Within Betaproteobacteria, order Burkholderiales, family Sutterellaceae, 
Parasutterella was commonly retrieved from all fractions. 
Sutterella, from the same family, was also found in all but the V4 and 
V5 samples. Within Comamonadaceae, Pelomonas was found in all 
samples but never in FS fractions. 

Some members of Deltaproteobacteria, mainly related to Desulfovibrionales, 
were found, but without any kind of pattern and mainly 
in FS fractions. 

Gammaproteobacteria such as Escherichia/Shigella were found at 
low frequency among all samples: in FS (V5 and V6 samples), Act (all 
samples) and IgA (V1, V2, V5 and V6). Other Enterobacteriaceae 
such as Morganella were found in the IgA fraction of all samples. 
Interestingly, members related to Alcanivorax, from Oceanospirillales, 
were found in all but the V4 sample, at least in the IgA fraction. 
Within Pseudomonadales, Acinetobacter was found in all but the V2 
sample in Act and/or IgA fractions. Pseudomonas was also one of 
the common genera retrieved from all Act and IgA fractions, but not 
from FS. Stenotrophomonas, from Xanthomonadales, was also found 
in all but the V2 sample in Act or IgA (or both) fractions. 

Actinobacteria. No genera belonging to Actinobacteria were found at 
a frequency higher than 1%. Within Actinomycetales, Microbacterium 
from Microbacteriaceae was found to be commonly present in 
all samples in Act (except for V1) and/or IgA but never in the FS 
fraction. The same trend was observed for Rhodococcus from 
Nocardiaceae and Propionibacterium from Propionibacteriaceae. 
Within Coriobacteriales, Collinsella was retrieved from all but V5 
samples whereas in samples V2 and V4 it was also recruited from 
the FS fraction (see Supplemental Information, Figure 2 - 
Actinobacteria). 

Figure 4 | Venn diagrams of species-level OTUs 97 distributions. Graphs show the unique and shared OTUs in FS, Act and IgA fractions, respectively. 
Sample code is reported close to each area. Numbers in central overlapping areas refer to the shared OTUs 97 in the core microbiota. 

Table 2 | Core microbiota. Columns describe taxonomic ranks of core OTUs 97 of each fraction among all samples. Asterisks indicate families 
found in at least 5 out of 6 samples 

Fraction  Phylum          Class                Order             Family
FS        Bacteroidetes   Bacteroidia          Bacteroidales     Bacteroidaceae
FS        Firmicutes      Clostridia           Clostridiales     Lachnospiraceae
FS        Firmicutes      Clostridia           Clostridiales     Ruminococcaceae
FS        Firmicutes      Clostridia           Clostridiales     ud-Clostridiales
FS        Firmicutes      Bacilli              Lactobacillales   Streptococcaceae*
FS        ud-Bacteria     ud-Bacteria          ud-Bacteria       ud-Bacteria
ACT       Actinobacteria  Actinobacteria       Actinomycetales   Microbacteriaceae*
ACT       Actinobacteria  Actinobacteria       Actinomycetales   Nocardiaceae*
ACT       Bacteroidetes   Bacteroidia          Bacteroidales     Bacteroidaceae
ACT       Proteobacteria  Alphaproteobacteria  Sphingomonadales  Erythrobacteraceae
ACT       Proteobacteria  Alphaproteobacteria  Sphingomonadales  Sphingomonadaceae
ACT       Proteobacteria  Alphaproteobacteria  Sphingomonadales  ud-Sphingomonadales
ACT       Proteobacteria  Betaproteobacteria   Burkholderiales   Comamonadaceae
ACT       Proteobacteria  Gammaproteobacteria  Pseudomonadales   Pseudomonadaceae
ACT       Proteobacteria  Gammaproteobacteria  Xanthomonadales   Xanthomonadaceae*
ACT       Firmicutes      Clostridia           Clostridiales     Ruminococcaceae*
ACT       Firmicutes      Clostridia           Clostridiales     Lachnospiraceae
ACT       Firmicutes      Clostridia           Clostridiales     ud-Clostridiales
ACT       Firmicutes      Bacilli              Lactobacillales   Streptococcaceae*
IgA       Actinobacteria  Actinobacteria       Actinomycetales   Nocardiaceae
IgA       Bacteroidetes   Bacteroidia          Bacteroidales     Bacteroidaceae
IgA       Proteobacteria  Alphaproteobacteria  Sphingomonadales  Sphingomonadaceae
IgA       Proteobacteria  Alphaproteobacteria  Sphingomonadales  ud-Sphingomonadales
IgA       Proteobacteria  Betaproteobacteria   Burkholderiales   Comamonadaceae
IgA       Proteobacteria  Gammaproteobacteria  Pseudomonadales   Pseudomonadaceae
IgA       Firmicutes      Bacilli              Lactobacillales   Enterococcaceae
IgA       Firmicutes      Clostridia           Clostridiales     Ruminococcaceae
IgA       Firmicutes      Clostridia           Clostridiales     Lachnospiraceae
IgA       Firmicutes      Bacilli              Lactobacillales   Streptococcaceae*

Discussion 

It is now widely accepted that most of the relationships between 
micro-organisms and higher organisms (e.g., plants, animals and 
others) are most probably commensal or mutualistic 31 . In this framework, 
there is growing evidence that the immune system evolved to accommodate 
bacterial colonization of growing complexity by modulating 
its reactivity 18,32,33 . In this work, we show that the taxonomic distributions 
obtained from active and S-IgA-coated bacterial fractions isolated 
from healthy human volunteers' faecal samples are highly dissimilar 
from those obtained from DNA extracted directly from faecal material. 
Generally speaking, what we observed in FS is consistent with 
results obtained in several other studies of gut microbiota 
distribution in which stool samples were used 34 . For example, there are 
previous reports that dominant OTUs are the ones most 
widespread among samples 23 . Firmicutes and Bacteroidetes are the 
main taxa commonly retrieved from all human stool samples and 
their ratio is often influenced by diet 35 . As also observed by Tap and 
collaborators 34 , several dominant taxa belonging to Lachnospiraceae, 
Clostridiaceae or Bacteroidaceae were found in the 
core microbiota of the FS fractions of all samples in this work. In 
addition, the use of flow cytometry-based sorting to obtain fractions, 
with subsequent pyrosequencing, enabled us to access active and 
S-IgA-coated bacterial fractions. 

It may seem contradictory that many bacteria found in the active 
and S-IgA-coated fractions were not detected in the faecal controls. 
The reason for this lack of detection is probably the lack of 
amplification of rare bacteria, as PCR is biased towards preferentially 
amplifying the most common DNA templates in the sample 36 . Thus, 
the dominance of some bacterial taxa in faecal samples masks an 
important part of the diversity, which includes active and IgA-opsonised 
bacteria. 

We found that, overall, the ratio of Bacteroidetes/Firmicutes recruitment 
is maintained in the FS, Act and IgA microbiota but, in the latter 
two fractions, their presence decreased in favour of other phyla, such 
as Proteobacteria and Actinobacteria. Thus, employing the per-cell RNA 
content as a measure of bacterial activity 25,26 , we show that 
several still unidentified or very rare taxa are indeed active players 
in the human microbiota and in close interaction with the immune 
system. In this context, when active and S-IgA-coated bacterial fractions 
were characterized, we were able to widen the microbiota core 
to members of Proteobacteria including Sphingomonadaceae and 
Pseudomonadaceae, while, in a less restricted core (in five out of 
six samples), we also found members of Actinobacteria. The presence 
of core S-IgA-coated bacteria is consistent with observations that 
the immune system is continuously stimulated by commensal 
bacteria 37,38 . 

Within Firmicutes, we found that the active core microbiota includes 
members of Blautia, whose species B. producta has been tested as 
an active fermenter in simplified human intestinal microbiota experiments 
using germ-free rats 39 . Other members of Firmicutes, such as 
Lactobacillus or Enterococcus, which are important due to their role 
as immune system modulators or in helping intestinal absorption 40 , 
were commonly found as part of the active and IgA fractions. 

Regarding Bacteroidetes, strict anaerobes such as members of the genus 
Prevotella, known as cellulose and xylan degraders, were already 
found as part of the core microbiota of children whose diet was high 
in fibre content 35 . In our experiment we found Prevotella as core 
bacteria only when we focused our attention on active or S-IgA-coated 
bacteria. Thus, although members of this group are considered 
rare, they are commonly found to be active. 
A very interesting finding is the case of the phylum Proteobacteria, 
whose members are almost invisible when DNA extraction from 
faecal samples is used without any selection or 
sorting protocol. In our work, we show that Proteobacteria 
are recurrent active members of the gut microbiota and are in close 
interaction with the immune system. Sphingomonadales members 
(from Alphaproteobacteria) are important players in shaping the 
immune system. Sphingomonas sp. was highly retrieved in four out 
of six samples, while it was found at less than 1% in frequency in the 
other two samples. Studies in germ-free mice supplemented with 
conventional or restricted gut microorganisms show that glycosylceramides, 
which are a component of the Sphingomonas cell wall, are 
identified as CD1d ligands stimulating invariant natural killer T 
cells 18,41,42 . We observed that several other Sphingomonadales families 
appear to be represented in the core microbiota, with genera including 
Porphyrobacter, Novosphingobium, Sphingomonas and others not 
yet well characterized but falling within the Sphingomonadaceae 
family. 

Other Proteobacteria commonly retrieved from all fractions 
were members of the Beta class, belonging to the order 
Burkholderiales. Parasutterella and Sutterella have been linked to 
healthy flora in studies of lean and obese mice 43 . Burkholderia were 
also commonly retrieved from active and S-IgA-coated microbial 
fractions but never from the FS fraction, probably because their low 
frequency is overwhelmed by dominant bacteria when a standard 
stool-based approach is applied. Burkholderia is often considered 
to be an opportunistic pathogen, but its commensal role 
makes it apt to survive in the intestine without generating infection 
processes. Hutchinson and collaborators reported Burkholderia as 
strictly aerobic bacteria, although viable colonies have also been 
reported to appear after days of anaerobic incubation 44 . This could 
explain its scarce presence and its recruitment from the gastrointestinal 
tract in active and S-IgA-coated fractions. 

Escherichia/Shigella-related Enterobacteriaceae were found in 
almost all samples and fractions. We also found Alcanivorax-related 
members, which are known to be mostly marine, active obligate oil-degrading 
bacteria 45 . Although its 16S rDNA homology with 
the closest member in GenBank is around 96% (data not shown), 
it is surprising to find it in active or S-IgA-coated fractions of 
almost all samples. Another member of Gammaproteobacteria identified was 
Acinetobacter (family Moraxellaceae), although its 16S sequence is 
distant from A. baumannii (97% similarity) and closer (99%) to other 
unidentified or uncultured Acinetobacter (data not shown). In a 
recent study in mice, Acinetobacter, Stenotrophomonas and 
Comamonas were hypothesized to be part of the so-called "crypt-specific 
core microbiota" 46 . Here we found that these groups of bacteria 
are almost always recruited from the human active microbiota or from 
the human S-IgA-coated microbiota. Some genera were commonly 
found in Act and IgA fractions but seem to be hidden when 
considering FS fractions. 

Moreover, Actinobacteria were commonly recruited at frequencies 
lower than 1%. In their review, Turroni and collaborators 47 described 
Actinobacteria as commonly retrieved from faecal samples 
using classical culture-dependent approaches but rarely 
found in metagenomics-based experiments 2,48,49 . Likewise, in our 
samples, Actinobacteria were almost absent in FS fractions while they 
were recruited at low frequency in active and/or S-IgA-coated bacterial 
fractions. Thus, their absence in the FS fraction cannot be attributed 
to biases in the DNA extraction method (applied in the same way to 
all fractions) or to the primers used for PCR amplification, but is rather 
a result of the sequencing optimization achieved when a specific fraction is 
selected using the flow-cytometry-based sorting approach. 

In conclusion, the proposed approach enabled us to obtain 16S 
rDNA amplicons from the genomic DNA of active bacterial cells, 
selected and sorted by flow cytometry. Compared to other methodologies 
in which active bacteria are identified by retrotranscription of 
ribosomal RNA, the taxonomic distributions observed in this work 
are not biased by the number of rRNA copies found in each cell; they 
are possibly biased only by the number of chromosomal 16S rDNA copies. 
As already pointed out in our previous 
work 21 , focusing on the active fraction of the gut microbiota represents a different 
approach to studying the human microbiota. The combination of 
this approach with the characterization of bacteria opsonised by 
different human immunoglobulins will allow future study of the 
active bacteria ignored by the immune system, as well as those actively 
growing but opsonised by different Ig types. We believe this will 
provide important insights into the host-microbiota interplay under 
both healthy and disease-related conditions. 

Finally, we describe the presence of a microbiota core among the 
faecal samples of six healthy volunteers. This core contains 
members of taxa already described in previous works; however, 
focusing the analysis on fractions of active or S-IgA-coated bacteria 
highlights the presence of other "important" taxa, which were previously 
not visible. For instance, we found that, in addition to the 
already known members of the gut microbiota core, such as 
Firmicutes and Bacteroidetes, other active bacteria such as members 
of Actinobacteria and Proteobacteria (including genera of the 
Sphingomonadaceae and Moraxellaceae families) may play an 
important role in gut homoeostasis that has yet to be elucidated. 

Methods 

Samples and fractions. Samples were obtained from six healthy volunteers (between 
20 and 36 years old, three male and three female) identified as V1, V2, V3, V4, V5 and 
V6. All participants expressed their interest in participating in this study by signing an 
informed consent form, approved by the Ethics and Research Committee of the Centre 
for Public Health Research (CSISP) of Valencia, Spain. One of the main exclusion 
criteria was the administration of antibiotics during the two months 
prior to sampling. None of the volunteers had organic intestinal disorders. All 
volunteers followed a Mediterranean diet. 

For each faecal sample, we studied the taxonomical distribution of the faecal suspension 
(FS), active bacteria (Act) and S-IgA-coated bacterial fractions (IgA). 

The volunteers collected faecal material in sterile 30 ml screw-cap containers 
(25690 mm; PP SPOON; DELTALAB), containing 8 ml RNAlater (Ambion 
#AM7020) in order to preserve RNA. Samples were kept at — 25 "C, delivered to the 
lab within the next 24 hours and immediately processed. From each sample, the 
faecal material was resuspended by vortexing (2 min). It was washed twice in 
physiological solution (NaCl 0.9%). Faecal suspension was centrifuged (800 g) for 
2 min to pellet big aggregates. Then, supernatant was centrifuged at 7500 g for 7 min 
to collect microbial cells from the faecal suspension. The pellet was washed twice in 
physiological solution (NaCl 0.9%). Cells were immediately fixed by adding 1/10 volume of 
37% formaldehyde (final concentration: 3.7%) and incubated overnight at 4°C. Fixed 
cells were washed twice to remove residual formaldehyde and resuspended in 0.1 ml 
of physiological solution. The samples were stored at -20°C after adding 1 volume of 
absolute ethanol. Three fractions were obtained from each sample stored in the 
previous step. The DNA in the first aliquot was extracted prior to flow cytometry steps 
(FS fraction). The second and third fractions were obtained by flow cytometry sorting. 
Thus we obtained the active population by means of RNA content and the S-IgA-coated 
population using anti-human IgA immunoglobulin. For staining and flow 
cytometry protocols, fixed cells from previous steps were washed and diluted in 
physiological solution to an optical density (OD600) of around 0.2. 

Staining. Cell labelling and flow cytometry sorting. Before sorting, cell suspensions 
were disaggregated by mild sonication and filtering, so cells ran freely in the 
microfluidic system (see SI Figure 3). Ten microlitres of pyronin-Y (Sigma-Aldrich, 
#P9172, 10 mg/ml) diluted to 100 µM was added to the samples (1 ml volume) for 
total RNA staining and incubated for 1 hour at 4°C. The samples were then stained 
with SYTO 62 (Invitrogen, #S11344) according to the manufacturer's instructions (final 
concentration of 0.5 µM) in order to distinguish the bacteria from the noise during 
flow cytometry by means of their DNA content. S-IgA-coated bacterial 
staining was performed using anti-human IgA labelled with FITC (Invitrogen, #62-7411). 
Anti-mouse IgA labelled with FITC (Invitrogen, #M31001) was used for 
isotype control (see Supplemental Information, Figures 3 and 4). 

Flow cytometry sorting was carried out using the MoFlo™ XDP Cell Sorter. The 
light sources were the Argon 488 nm (blue) laser (200 mW power) and the 635 nm 
(red) diode laser (25 mW power). The lasers were aligned using Flow-Check™ 
(10 µm) and Flow-Set™ (3 µm). The cytometer emission filters were 520/30 (FL1), 
580/30 (FL2) and 680/30 (FL4), collecting emission from FITC, Pyronin Y and SYTO 62, 
respectively. The trigger was set on side-scatter. 

DNA extraction and sequencing. DNA extraction from all fractions was carried out 
using the CTAB method 50 . Total 16S rDNA was amplified from each fraction using 8F 
and 530R universal primers for bacteria 51 using multiplex identifiers (MIDs, 
Supplemental Information, Table 2). PCR products obtained from each fraction/ 
sample were purified by Nucleofast 96 PCR filter plates (Macherey Nagel #74310050) 
and concentrations were measured by PicoGreen assay. The products were pooled 
obtaining balanced final concentrations and sequenced using the 454 GS-FLX 
pyrosequencer (Titanium chemistry, Roche). 

Bioinformatics. Obtained sequences were trimmed from the end in sliding windows 
of 10 nucleotides when the average quality value was lower than 20, using Prinseq 
(v0.19.4) software 52 . This considerably improved the quality of the reads, as the 
quality of pyrosequencing reads has been shown to dramatically decrease towards the 
end of the sequences 53 . Sequences shorter than 200 nucleotides were discarded. 
Taxonomic assignments were carried out using the RDP_classifier 54 , and 
phylogenetic ranks were assigned when scores exceeded 0.8. Clustering was 
performed with CD-HIT software 55 . 
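
The trimming and length filter described above can be sketched as follows. This is a minimal illustration in Python, not the Prinseq implementation: the function names, the one-base step, and the choice to evaluate the mean quality over the last 10 bases of the read are assumptions made for the example.

```python
def trim_3prime(quals, window=10, min_mean_q=20, step=1):
    """Return how many bases to keep after trimming from the 3' end.

    Assumed rule: while the mean quality of the last `window` bases is
    below `min_mean_q`, drop `step` bases from the end of the read.
    """
    keep = len(quals)
    while keep > 0:
        tail = quals[max(0, keep - window):keep]
        if sum(tail) / len(tail) >= min_mean_q:
            break
        keep -= step
    return max(keep, 0)


def filter_reads(reads, min_len=200, window=10, min_mean_q=20):
    """Trim each (sequence, qualities) pair and drop reads shorter than min_len."""
    kept = []
    for seq, quals in reads:
        n = trim_3prime(quals, window=window, min_mean_q=min_mean_q)
        if n >= min_len:
            kept.append((seq[:n], quals[:n]))
    return kept
```

Because the rule averages over a window, a few low-quality bases can survive at the very end of a trimmed read as long as the window mean reaches the threshold.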

Percentage distributions of fractions coming from flow cytometry sorting (Act and 
IgA) were multiplied by the cell abundance rate of the sorted fraction with respect to 
the total amount of cells counted during the whole sorting (see Table 1 column 
"Fraction rate"). Percentage values of FS fractions remained unaltered. 
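
As an illustration of the rescaling described above, the following sketch multiplies each taxon's percentage in a sorted fraction by that fraction's rate. The function name, the taxon labels and the cell counts are hypothetical; the actual rates are those reported in Table 1.

```python
def scale_by_fraction_rate(percentages, sorted_cells, total_cells):
    """Rescale per-taxon percentages of a sorted fraction (Act or IgA).

    percentages  : dict mapping taxon -> percentage within the sorted fraction
    sorted_cells : cells counted in the sorted fraction (hypothetical input)
    total_cells  : all cells counted during the whole sorting (hypothetical input)
    """
    if total_cells <= 0:
        raise ValueError("total_cells must be positive")
    rate = sorted_cells / total_cells  # the "fraction rate"
    return {taxon: pct * rate for taxon, pct in percentages.items()}


# Toy numbers only: a fraction that represents one quarter of all counted cells.
example = scale_by_fraction_rate({"taxon_A": 60.0, "taxon_B": 40.0}, 2500, 10000)
```

After rescaling, the values of a sorted fraction sum to its fraction rate (in percent) rather than to 100, which makes them comparable with the unaltered FS percentages.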

Descriptive and statistical analyses were carried out in the R statistical 
environment using the vegan R package 56,57 . Venn diagrams for pan-microbiota analysis were 
obtained using the Vennerable R package 58 . Flow cytometry data were analysed using the R 
packages flowCore and flowViz from Bioconductor 59-61 . 
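
The core and pan-microbiota underlying such Venn diagrams can be illustrated with plain set operations. The sketch below is written in Python rather than R, assumes a simple presence/absence definition (a taxon belongs to the core if it is detected in every volunteer), and uses placeholder taxon names; it is not the Vennerable workflow itself.

```python
def core_and_pan(taxa_by_volunteer):
    """Return (core, pan) taxon sets from a dict of volunteer -> detected taxa.

    core: taxa detected in every volunteer (assumed definition)
    pan : taxa detected in at least one volunteer
    """
    if not taxa_by_volunteer:
        raise ValueError("at least one volunteer is required")
    sets = [set(taxa) for taxa in taxa_by_volunteer.values()]
    core = set.intersection(*sets)
    pan = set.union(*sets)
    return core, pan


# Placeholder data, not results from the study.
toy = {
    "V1": ["taxon_A", "taxon_B", "taxon_C"],
    "V2": ["taxon_A", "taxon_C"],
    "V3": ["taxon_A", "taxon_C", "taxon_D"],
}
core, pan = core_and_pan(toy)
```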

Accession numbers. Sequences were deposited in EMBL-EBI Sequence Read Archive 
(SRA) under study number ERP002046. 